In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import random
Let's start with a simple homogenous Poisson process. A Poisson process is a continuous-time stochastic process that models the occurence of events within a given time interval. A Poisson process is known as a pure-birth process. The term homogenous refers to the fact that the rate of event occurences is constant over the time domain. The events are assumed to occur with an arrival time modeled by an exponential distribution.
The Poisson process is described by:
$$ P[(N(t + \Delta t) - N(t)) = k] = \frac{(\lambda \Delta t)^k}{k!}e^{-\lambda \Delta t}$$The above equations describes the probability of $k$ events occuring in the interval $(t, t + \Delta t]$ given $\lambda$, the rate of event occurences given in units of events/$\Delta t$.
I'll mention two approaches for generating events from the Poisson process.
The first approach is similar to the Roulette Wheel sampling method discussed early on. We can build the cumulative density function (CDF) by computing the probabilities for $k = 0, 1, 2, \dots$ events occuring in the time interval $(t, t + \Delta t]$. We then generate a random number from a uniform distribution and find the number of events by finding the matching range of probabilities. Since $k$ could be infinite, we only generate probabilities until we have a match.
There are two potential problems with this approach:
As a result, this approach ends up being computationally intensive.
For the second approach, we can compute the waiting time $X_n$ of event $n$ by sampling from the exponential distribution:
$$ f(X_n) = (\lambda \Delta t)^k e^{-\lambda \Delta t} $$The arrival time of $n$ is the sum of all the previous waiting times:
$$ t_n = \sum_{i=1}^n X_i $$The second approach overcomes both disadvantages of the first approach. It gives exact arrival times and is computationally efficient.
(I am unsure of whether the second approach will hold for non-homogenous Poisson processes.)
In [2]:
class HomogenousPoissonProcess(object):
def __init__(self, rate=None):
self._rate = rate
self._last_arrival = 0.0
self._last_time_period = 0.0
def sample(self, time_period):
arrival_times = []
if self._last_arrival > self._last_time_period:
arrival_times.append(self._last_arrival)
self._last_time_period += time_period
while True:
waiting_time = random.expovariate(self._rate)
self._last_arrival += waiting_time
if self._last_arrival < self._last_time_period:
arrival_times.append(self._last_arrival)
else:
break
return arrival_times
In [3]:
rate = 10.0 # 10 events per hour
process = HomogenousPoissonProcess(rate=rate)
arrival_times = process.sample(24.0)
print "%s events" % len(arrival_times)
print "Expected: %s events/hour" % rate
print "Observed: %s events/hour" % (len(arrival_times) / 24.0)
plt.plot(xrange(1, len(arrival_times) + 1), arrival_times)
plt.xlabel("Events")
plt.ylabel("Arrival Times")
plt.title("Homogenous Poisson Process")
Out[3]:
In [3]: